Added to_words2 function to convert from number to spanish words #7

Closed
wants to merge 1 commit into from

PavoDive

A function to_words2() was created to overcome some limitations and errors of to_words.

to_words2 can take a number up to 1e22 or a string of digits up to 60 digits long, and convert it to words, thus expanding the capabilities of to_words.
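The reason for accepting digit strings can be illustrated with a minimal sketch of double-precision limits. The variable names are illustrative only and are not part of the package; the sketch assumes the usual IEEE 754 doubles that R uses for numeric values.

```r
## Sketch only: shows where doubles stop representing every integer exactly.
## All names here are illustrative and not part of the spanish package.

## Number of mantissa bits in a double (53 on IEEE 754 platforms).
mantissa_bits <- .Machine$double.digits

## Largest power of two up to which every integer is representable.
limit <- 2^mantissa_bits

## Adding 1 above that limit is lost to rounding.
lost_increment <- (limit + 1) == limit

## Printing the value in full shows the stored digits, not the intended ones.
as_printed <- sprintf("%.0f", limit + 1)

## A character string keeps every digit exactly as typed.
digit_string <- "9007199254740993"

print(lost_increment)
print(as_printed)
print(nchar(digit_string))
```

Numbers well below this limit are unaffected; the string form only matters for very large values whose digits must all be preserved.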

It also fixes some errors of to_words, particularly the conversion of quantities ending in "millions", which to_words() converted to "millones y mil".
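A regression check for this class of error could look like the following sketch. Both helper names (`has_spurious_mil`, `squish`) are hypothetical and not part of the package; the sample strings are taken from the outputs reported in this pull request.

```r
## Sketch only: hypothetical helpers, not part of the spanish package.

## Flags outputs that end in "... millones mil" (or "... millón mil")
## when the value is an exact multiple of one million, where no "mil"
## should appear after the millions.
has_spurious_mil <- function(value, words) {
    is_exact_million <- value >= 1e6 & value %% 1e6 == 0
    ends_in_mil <- grepl("mill\\S+\\s+mil\\s*$", words)
    is_exact_million & ends_in_mil
}

## Collapses runs of whitespace and trims both ends, so that spacing
## differences do not hide or create mismatches when comparing outputs.
squish <- function(words) {
    trimws(gsub("\\s+", " ", words))
}

print(has_spurious_mil(20e6, "veinte millones mil"))
print(has_spurious_mil(20001000, "veinte millones mil"))
print(squish("novecientos  millones mil "))
```

The value is passed alongside the words because "veinte millones mil" is a legitimate result for 20001000 and only wrong for 20e6.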

devtools::document() was run in order to update the documentation of the package, but some adjustments may be needed in the original functions, as the following error was displayed:

✖ spanish.R:28: @docType "package" is deprecated.
ℹ Please document "_PACKAGE" instead.

PavoDive commented Nov 10, 2024

Hi Jose Manuel,

Please consider these tests:


## This file is made to compare the output of functions
## spanish::to_words() and to_words2() on some numbers.

values <- c(1:99,
            seq(100, 900, 100),
            seq(1000, 9000, 1000),
            seq(1e4, 9e4, 1e4),
            seq(1e5, 9e5, 1e5),
            seq(1e6, 9e6, 1e6),
            seq(1e7, 9e7, 1e7),
            seq(1e8, 9e8, 1e8),
            seq(1e9, 9e9, 1e9),
            seq(1e10, 9e10, 1e10),
            seq(1e11, 9e11, 1e11),
            seq(1e12, 9e12, 1e12),
            seq(1e13, 9e13, 1e13),
            seq(1e14, 9e14, 1e14),
            seq(1e15, 9e15, 1e15),
            seq(1e16, 9e16, 1e16),
            seq(1e17, 9e17, 1e17),
            seq(1e18, 9e18, 1e18),
            seq(1e19, 9e19, 1e19),
            seq(1e20, 9e20, 1e20),
            seq(1e21, 9e21, 1e21),
            seq(1e22, 9e22, 1e22), ## from here some arbitrary numbers:
            102,
            1003,
            1030,
            10004,
            10040,
            10400,
            100005,
            100050,
            100500,
            105000,
            150000,
            1000006,
            1000060,
            1000600,
            1006000,
            1060000,
            1600000,
            31415927, # pi*1e7
            271828183 # exp(1)*1e8
            )

df = data.frame(values = values)

df$string_rep = format(df$values, scientific = FALSE)

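df$to_words = lapply(
    df$values,
    function(x) {
        ## spanish::to_words() does not handle values of 1e9 or more,
        ## so those are recorded as NA instead of being converted.
        if (x > 999999999) NA_character_ else spanish::to_words(x)
    }
)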

df$to_words2 = lapply(df$values, to_words2)

## OBSERVED ERRORS AND NOTES:
## spanish::to_words adds 1000 to all numbers ending in "millones":
##      20e6 --> "veinte millones mil"; 5e6 --> "cinco millones mil".
## spanish::to_words adds untrimmed spaces to the right of the words.
## spanish::to_words has two spaces between words: 900e6 -->
##     "novecientos  millones mil ".
## spanish::to_words stops working just before 1e9 (mil millones).
## to_words2 fails above 4e22 (cuarenta mil trillones), but this is due to floating-point
##    precision limits, as can be seen in the string representation of the number.
## spanish::to_words represents wrongly the numbers 1000006, 1000060 and 10006000, 
##    adding 1000 to each of them: "un millón mil y seis"

## TEXT INPUT
## In order to circumvent the precision errors for big numbers, to_words2() can
## take strings of digits as input:

set.seed(77)
## a number with 22 digits:
j = sapply(1:22,
           function(x) as.character(sample(0:9, 1))) |>
    paste(collapse = "")
print(j)
to_words2(j)

## > [1] "1484757575318806025265"
## > [1] "mil cuatrocientos ochenta y cuatro trillones setecientos cincuenta y siete mil 
##              quinientos setenta y cinco billones trescientos diez y ocho mil ochocientos y seis 
##              millones veinte y cinco mil doscientos sesenta y cinco"

## a number with 60 digits:
j = sapply(1:60,
           function(x) as.character(sample(0:9, 1))) |>
    paste(collapse = "")
print(j)
to_words2(j)

## > [1] "212430083327760288597253958069365214126894321836306109691803"
## > [1] "doscientos doce mil cuatrocientos treinta nonillones ochenta y tres mil trescientos 
##              veinte y siete octillones setecientos sesenta mil doscientos ochenta y ocho 
##              septillones quinientos noventa y siete mil doscientos cincuenta y tres sextillones 
##              novecientos cincuenta y ocho mil sesenta y nueve quintillones trescientos 
##              sesenta y cinco mil doscientos catorce cuatrillones ciento veinte y seis mil 
##              ochocientos noventa y cuatro trillones trescientos veinte y un mil ochocientos 
##              treinta y seis billones trescientos y seis mil ciento y nueve millones seiscientos 
##              noventa y un mil ochocientos y tres"
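The sample outputs above follow the long scale, where each named unit (millones, billones, trillones, ...) covers six digits. The sketch below shows one way a digit string could be split into such groups. It is inferred from the printed results and is not taken from the to_words2() source; the helper name `split_long_scale` is hypothetical.

```r
## Sketch only: hypothetical helper, not the implementation used by to_words2().
## Splits a string of digits into groups of `size` digits, counting from the right,
## so that the last group holds the units and each earlier group the next scale word.
split_long_scale <- function(digits, size = 6L) {
    stopifnot(is.character(digits),
              length(digits) == 1L,
              grepl("^[0-9]+$", digits))
    n <- nchar(digits)
    ## End position of each group, from left to right.
    ends <- rev(seq(n, 1L, by = -size))
    ## Start position of each group; the leftmost group may be shorter.
    starts <- pmax(ends - size + 1L, 1L)
    substring(digits, starts, ends)
}

groups <- split_long_scale("1484757575318806025265")
print(groups)
```

For the 22-digit example this gives the groups "1484", "757575", "318806" and "025265", which line up with the trillones, billones, millones and units parts of the printed result.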

@PavoDive PavoDive closed this Dec 30, 2024